Determining the Origin and Structure of Person Names

Authors

  • Yu Fu
  • Feiyu Xu
  • Hans Uszkoreit

Abstract

This paper presents HENNA (Hybrid Person Name Analyzer), a novel system for identifying the language origin and analyzing the linguistic structure of person names. We apply ME-based classification methods to language origin identification and achieve very promising performance. We show that word-internal character sequences provide surprisingly strong evidence for predicting the language origin of person names. Our approach is context-, language- and domain-independent and can thus be easily adapted to person names in or from other languages. Furthermore, we provide a novel strategy for handling origin ambiguities or multiple origins in a name. HENNA also provides a person name parser for analyzing the linguistic and knowledge structures of person names. All knowledge about a person name in HENNA is modelled in a person-name ontology, which covers the relationships between language origins, the linguistic features and grammars of person names in a specific language, and the interpretation of name elements. The approaches presented here are useful extensions of the named entity recognition task.
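
The idea of predicting language origin from word-internal character sequences can be illustrated with a small sketch. It is not HENNA's implementation: it assumes that "ME" stands for maximum entropy (multinomial logistic regression), uses character n-grams with boundary markers as the only features, and trains on a hypothetical six-name toy set. The names char_ngrams, predict_proba, train_maxent and TOY_SAMPLES are invented for this illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy training data (surname, language origin); illustration only.
TOY_SAMPLES = [
    ("Schmidt", "German"), ("Müller", "German"),
    ("Zhang", "Chinese"), ("Wang", "Chinese"),
    ("Rossi", "Italian"), ("Bianchi", "Italian"),
]

def char_ngrams(name, n_min=1, n_max=3):
    """Return word-internal character n-grams; ^ and $ mark word boundaries."""
    feats = []
    for token in name.lower().split():
        padded = "^" + token + "$"
        for n in range(n_min, n_max + 1):
            for i in range(len(padded) - n + 1):
                feats.append(padded[i:i + n])
    return feats

def predict_proba(weights, name):
    """Softmax over per-origin sums of n-gram weights (a maximum entropy model)."""
    feats = char_ngrams(name)
    scores = {label: sum(w.get(f, 0.0) for f in feats)
              for label, w in weights.items()}
    top = max(scores.values())  # subtract the maximum for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def train_maxent(samples, epochs=200, lr=0.5):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    labels = sorted({label for _, label in samples})
    weights = {label: defaultdict(float) for label in labels}
    for _ in range(epochs):
        for name, gold in samples:
            feats = char_ngrams(name)
            probs = predict_proba(weights, name)
            for label in labels:
                # Gradient: observed indicator minus predicted probability.
                grad = (1.0 if label == gold else 0.0) - probs[label]
                for f in feats:
                    weights[label][f] += lr * grad / len(feats)
    return weights

if __name__ == "__main__":
    model = train_maxent(TOY_SAMPLES)
    # Print the full distribution so that ambiguous origins stay visible.
    for origin, p in sorted(predict_proba(model, "Wang").items()):
        print(f"{origin}: {p:.3f}")
```

Returning the whole probability distribution rather than a single label is a design choice made for this sketch: a name with several plausible origins then shows up as several comparable probabilities. The abstract does not describe HENNA's own strategy for ambiguous or multiple origins, so this should not be read as that strategy.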

Publication date: 2010